Video Semantic Segmentation


Video semantic segmentation is the task of assigning a semantic class label to every pixel in every frame of a video.

SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM

Feb 03, 2026
Via arXiv

Multi-Objective Optimization for Synthetic-to-Real Style Transfer

Feb 03, 2026
Via arXiv

MLV-Edit: Towards Consistent and Highly Efficient Editing for Minute-Level Videos

Feb 02, 2026
Via arXiv

Sem-NaVAE: Semantically-Guided Outdoor Mapless Navigation via Generative Trajectory Priors

Feb 01, 2026
Via arXiv

InspecSafe-V1: A Multimodal Benchmark for Safety Assessment in Industrial Inspection Scenarios

Jan 29, 2026
Via arXiv

LEMON: How Well Do MLLMs Perform Temporal Multimodal Understanding on Instructional Videos?

Jan 27, 2026
Via arXiv

PyraTok: Language-Aligned Pyramidal Tokenizer for Video Understanding and Generation

Jan 22, 2026
Via arXiv

Hierarchical Long Video Understanding with Audiovisual Entity Cohesion and Agentic Search

Jan 20, 2026
Via arXiv

See More, Store Less: Memory-Efficient Resolution for Video Moment Retrieval

Jan 14, 2026
Via arXiv

CroBIM-V: Memory-Quality Controlled Remote Sensing Referring Video Object Segmentation

Jan 17, 2026
Via arXiv